Speaker Adaptive Training Localizing Speaker Modules in DNN for Hybrid DNN-HMM Speech Recognizers
نویسندگان
چکیده
منابع مشابه
DNN-Based Speaker Clustering for Speaker Diarisation
Speaker diarisation, the task of answering “who spoke when?”, is often considered to consist of three independent stages: speech activity detection, speaker segmentation and speaker clustering. These represent the separation of speech and nonspeech, the splitting into speaker homogeneous speech segments, followed by grouping together those which belong to the same speaker. This paper is concern...
متن کاملA Speaker Adaptive DNN Training Approach for Speaker-Independent Acoustic Inversion
We address the speaker-independent acoustic inversion (AI) problem, also referred to as acoustic-to-articulatory mapping. The scarce availability of multi-speaker articulatory data makes it difficult to learn a mapping which generalizes from a limited number of training speakers and reliably reconstructs the articulatory movements of unseen speakers. In this paper, we propose a Multi-task Learn...
متن کاملA study of speaker adaptation for DNN-based speech synthesis
A major advantage of statistical parametric speech synthesis (SPSS) over unit-selection speech synthesis is its adaptability and controllability in changing speaker characteristics and speaking style. Recently, several studies using deep neural networks (DNNs) as acoustic models for SPSS have shown promising results. However, the adaptability of DNNs in SPSS has not been systematically studied....
متن کاملPreliminary Work on Speaker Adaptation for Dnn-based Speech Synthesis
We investigate speaker adaptation in the context of deep neural network (DNN) based speech synthesis. More specifically, our current work focuses on the exploitation of auxiliary information such as gender, speaker identity or age during the DNN training process. The proposed technique is compared to standard acoustic feature transformations such as the feature based maximum likelihood linear r...
متن کاملSpeaker Adaptation in DNN-Based Speech Synthesis Using d-Vectors
The paper presents a mechanism to perform speaker adaptation in speech synthesis based on deep neural networks (DNNs). The mechanism extracts speaker identification vectors, socalled d-vectors, from the training speakers and uses them jointly with the linguistic features to train a multi-speaker DNNbased text-to-speech synthesizer (DNN-TTS). The d-vectors are derived by applying principal compo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: IEICE Transactions on Information and Systems
سال: 2016
ISSN: 0916-8532,1745-1361
DOI: 10.1587/transinf.2016slp0010